Local spectral variability features for speaker verification
Authors
Md Sahidullah ([email protected]), Tomi Kinnunen ([email protected])
Preprint submitted to Digital Signal Processing, December 3, 2015
Abstract
Speaker verification techniques neglect the short-time variation in the feature space even though it contains speaker-related attributes. We propose a simple method to capture and characterize these spectral variations through the eigenstructure of the sample covariance matrix. This covariance is computed using a sliding window over spectral features. The newly formulated feature vectors representing local spectral variations are used with classical and state-of-the-art speaker recognition systems. Results on multiple speaker recognition evaluation corpora reveal that eigenvectors weighted with their normalized singular values are useful in representing local covariance information. We have also shown that local variability features can be extracted using mel frequency cepstral coefficients (MFCCs) as well as using three recently developed features: frequency domain linear prediction (FDLP), mean Hilbert envelope coefficients (MHECs) and power-normalized cepstral coefficients (PNCCs). Since the information conveyed in the proposed feature is complementary to the standard short-term features, we apply different fusion techniques. We observe considerable relative improvements in speaker verification accuracy in combined mode on text-independent (NIST SRE) and text-dependent (RSR2015) speech corpora. We have obtained up to 12.28% relative improvement in speaker recognition accuracy on text-independent corpora. In experiments on text-dependent corpora, we have achieved up to 40% relative reduction in EER. To sum up, combining local covariance information with the traditional cepstral features holds promise as an additional speaker cue in both text-independent and text-dependent recognition.
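The extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the window length, the number of eigenvectors kept, the normalization (each eigenvalue divided by the sum of all eigenvalues) and the sign convention are assumptions made here for illustration, and the function name `local_variability_features` is hypothetical. For a symmetric positive semi-definite covariance matrix the singular values coincide with the eigenvalues, so `numpy.linalg.eigh` is used.

```python
import numpy as np


def local_variability_features(feats, win=9, n_vecs=1, eps=1e-12):
    """Sketch of local spectral variability features.

    feats  : (T, D) array of frame-level features (e.g. MFCCs).
    win    : odd sliding-window length in frames (assumed value).
    n_vecs : number of leading eigenvectors kept per window (assumed value).
    Returns an array of shape (T - win + 1, n_vecs * D).
    """
    feats = np.asarray(feats, dtype=float)
    n_frames, _ = feats.shape
    half = win // 2
    out = []
    for t in range(half, n_frames - half):
        block = feats[t - half:t + half + 1]        # (win, D) local context
        cov = np.cov(block, rowvar=False)           # (D, D) sample covariance
        eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
        eigvals = np.clip(eigvals, 0.0, None)       # remove tiny negative round-off
        order = np.argsort(eigvals)[::-1][:n_vecs]  # indices of the largest eigenvalues
        # Assumed normalization: share of the total local variance.
        weights = eigvals[order] / max(eigvals.sum(), eps)
        vecs = eigvecs[:, order]                    # (D, n_vecs)
        # Eigenvectors are defined up to sign; make the largest-magnitude
        # component positive so features are comparable across windows.
        idx = np.argmax(np.abs(vecs), axis=0)
        signs = np.sign(vecs[idx, np.arange(vecs.shape[1])])
        signs[signs == 0] = 1.0
        vecs = vecs * signs
        out.append((vecs * weights).T.ravel())      # weighted eigenvectors, concatenated
    return np.asarray(out)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mfcc_like = rng.standard_normal((50, 13))       # stand-in for real MFCC frames
    lv = local_variability_features(mfcc_like, win=9, n_vecs=2)
    print(lv.shape)
```

The resulting vectors could then be modelled on their own or fused with the standard short-term features, as the abstract suggests; the fusion itself is not shown here.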
Similar resources
Using Exciting and Spectral Envelope Information and Matrix Quantization for Improvement of the Speaker Verification Systems
Speaker verification from a few spoken words or sentences has many applications. Many methods, such as DTW, HMM, VQ and MQ, can be used for speaker verification. We applied MQ for its precise, reliable and robust performance and its computational simplicity. We also used pitch frequency and log gain contour to further improve system performance.
Forensic speaker verification using formant features and Gaussian mixture models
A new method for speaker verification based on formant features is presented. A UBM-GMM verification system is applied to semi-automatically extracted formant features. Speaker-specific vocal tract configurations, including the speakers’ variability, are incorporated in the speaker models. Speaker comparisons are expressed as likelihood ratios (the ratio of similarity to typicality). F1, F2 and ...
A variable frame length and rate algorithm based on the spectral kurtosis measure for speaker verification
In this paper, we propose a spectral kurtosis based approach to extract features with a variable frame length and rate for speaker verification. Since the speaker-specific information of features in each frame changes depending upon the characteristics of speech, it is important to determine the appropriate frame length and rate to extract the salient feature frames. In order to distinctively r...
Application of shifted delta cepstral features in speaker verification
Recently, the Shifted Delta Cepstral (SDC) feature was reported to outperform the delta and delta-delta features in cepstral feature based language identification (LID) systems [1, 2]. This paper examines the application of SDC features in speaker verification and evaluates their robustness to channel mismatch, manner of speaking and session variability. The result of the experim...
Journal title:
- Digital Signal Processing
Volume 50, Issue
Pages -
Publication date 2016